transport problem
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Supervised Word Mover's Distance
Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger
Recently, a new document metric called the word mover's distance (WMD) has been proposed with unprecedented results on k NN-based document classification. The WMD elevates high-quality word embeddings to a document metric by formulating the distance between two documents as an optimal transport problem between the embedded words. However, the document distances are entirely unsupervised and lack a mechanism to incorporate supervision when available. In this paper we propose an efficient technique to learn a supervised metric, which we call the Supervised-WMD (S-WMD) metric.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
- Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.88)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Entropic optimal transport beyond product reference couplings: the Gaussian case on Euclidean space
Freulon, Paul, Georgakis, Nikitas, Panaretos, Victor
The optimal transport problem with squared Euclidean cost consists in finding a coupling between two input measures that maximizes correlation. Consequently, the optimal coupling is often singular with respect to Lebesgue measure. Regularizing the optimal transport problem with an entropy term yields an approximation called entropic optimal transport. Entropic penalties steer the induced coupling toward a reference measure with desired properties. For instance, when seeking a diffuse coupling, the most popular reference measures are the Lebesgue measure and the product of the two input measures. In this work, we study the case where the reference coupling is not necessarily assumed to be a product. We focus on the Gaussian case as a motivating paradigm, and provide a reduction of this more general optimal transport criterion to a matrix optimization problem. This reduction enables us to provide a complete description of the solution, both in terms of the primal variable and the dual variables. We argue that flexibility in terms of the reference measure can be important in statistical contexts, for instance when one has prior information, when there is uncertainty regarding the measures to be coupled, or to reduce bias when the entropic problem is used to estimate the un-regularized transport problem. In particular, we show in numerical examples that choosing a suitable reference plan allows to reduce the bias caused by the entropic penalty.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- North America > United States > Michigan (0.04)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Sample complexity of Schrödinger potential estimation
Puchkin, Nikita, Pustovalov, Iurii, Sapronov, Yuri, Suchkov, Denis, Naumov, Alexey, Belomestny, Denis
We address the problem of Schrödinger potential estimation, which plays a crucial role in modern generative modelling approaches based on Schrödinger bridges and stochastic optimal control for SDEs. Given a simple prior diffusion process, these methods search for a path between two given distributions $ρ_0$ and $ρ_T^*$ requiring minimal efforts. The optimal drift in this case can be expressed through a Schrödinger potential. In the present paper, we study generalization ability of an empirical Kullback-Leibler (KL) risk minimizer over a class of admissible log-potentials aimed at fitting the marginal distribution at time $T$. Under reasonable assumptions on the target distribution $ρ_T^*$ and the prior process, we derive a non-asymptotic high-probability upper bound on the KL-divergence between $ρ_T^*$ and the terminal density corresponding to the estimated log-potential. In particular, we show that the excess KL-risk may decrease as fast as $O(\log^2 n / n)$ when the sample size $n$ tends to infinity even if both $ρ_0$ and $ρ_T^*$ have unbounded supports.
- Asia > Russia (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Europe > Germany (0.04)
Slicing the Gaussian Mixture Wasserstein Distance
Piening, Moritz, Beinert, Robert
Gaussian mixture models (GMMs) are widely used in machine learning for tasks such as clustering, classification, image reconstruction, and generative modeling. A key challenge in working with GMMs is defining a computationally efficient and geometrically meaningful metric. The mixture Wasserstein (MW) distance adapts the Wasserstein metric to GMMs and has been applied in various domains, including domain adaptation, dataset comparison, and reinforcement learning. However, its high computational cost -- arising from repeated Wasserstein distance computations involving matrix square root estimations and an expensive linear program -- limits its scalability to high-dimensional and large-scale problems. To address this, we propose multiple novel slicing-based approximations to the MW distance that significantly reduce computational complexity while preserving key optimal transport properties. From a theoretical viewpoint, we establish several weak and strong equivalences between the introduced metrics, and show the relations to the original MW distance and the well-established sliced Wasserstein distance. Furthermore, we validate the effectiveness of our approach through numerical experiments, demonstrating computational efficiency and applications in clustering, perceptual image comparison, and GMM minimization
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)